600 research outputs found

An Empirical Evaluation of XQuery Processors

Author: Manegold S. (Stefan)
Publication venue: Blackwell Scientific
Publication date: 01/04/2008

This paper presents an extensive and detailed experimental evaluation of XQuery processors. The study consists of running five publicly available XQuery benchmarks, namely the Michigan benchmark (MBench), XBench, XMach-1, XMark and XOO7, on six XQuery processors: three stand-alone (file-based) XQuery processors (Galax, Qizx/Open, Saxon-B) and three XML/XQuery database systems (BerkeleyDB/XML, MonetDB/XQuery, X-Hive/DB). Besides assessing and comparing the functionality, performance and scalability of the various systems, the major focus of this work is to report in detail on the experience gained while performing such an exhaustive study, to discuss all the problems that we encountered and how we solved them, and hence to provide some guidelines (or even a recipe) for performing reproducible large-scale experime

CWI's Institutional Repository

Recommended from our members

Ten Hills Farm: The Forgotten History of Slavery in the North

Author: Manegold C. S.
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/03/2010

ScholarWorks@UMass Amherst

An Empirical Evaluation of XQuery Processors

Author: Manegold S. (Stefan)
Publication venue: CWI
Publication date: 01/01/2007

CWI's Institutional Repository

Revolutionary Database Technology for Data Intensive Research

Author: Kersten M.
Manegold S.
Publication date: 01/01/2012

International Migration, Integration and Social Cohesion online publications

Efficient resource utilization in shared-everything environments

Author: Manegold S. (Stefan)
Obermaier J.K.
Publication venue: CWI
Publication date: 01/01/1997

Efficient resource usage is key to achieving better performance in parallel database systems. Up to now, most research has focussed on balancing the load on several resources of the same type, i.e. balancing either CPU load or I/O load. In this paper, we present floating probe, a strategy for parallel evaluation of pipelining segments in a shared-everything environment that provides dynamic load balancing between CPU and I/O resources. The key idea of floating probe is to overlap, as much as possible with respect to data dependencies, the I/O-bound build phase and the CPU-bound probe phase of pipelining segments to improve resource utilization. Simulation results show that floating probe achieves shorter execution times while consuming less memory than conventional pipelining strategies

CWI's Institutional Repository

MonetDB/XQuery: a fast XQuery processor powered by a relational engine

Author: Boncz P.
Grust T.
Keulen M. van
Manegold S.
Rittinger J.
Teubner J.
Publication venue: ACM Press
Publication date: 01/01/2006

Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. While implementing this system, we had to extend the state-of-the-art with a number of new technical contributions, such as loop-lifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system are evaluated on the XMark benchmark up to data sizes of 11GB. The performance section also provides an extensive benchmark comparison of all major XMark results published previously, which confirms that the goal of purely relational XQuery processing, namely speed and scalability, was met
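
To make item (i) of the abstract above concrete, here is a minimal sketch of one range-based encoding, assuming a pre/size/level layout. The abstract does not specify the table columns, so the layout and the helper names `encode` and `descendants` are illustrative assumptions, not the system's actual schema. Each element becomes one row, and the descendants of a node occupy a contiguous range of rows.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return one row (pre, size, level, tag) per element, in document order.

    pre   = pre-order rank of the element
    size  = number of descendant elements
    level = depth below the root (root is level 0)
    Only elements are encoded here; text and attributes are ignored.
    """
    root = ET.fromstring(xml_text)
    rows = []

    def visit(node, level):
        pre = len(rows)
        rows.append(None)  # reserve the slot so children get later ranks
        for child in node:
            visit(child, level + 1)
        size = len(rows) - pre - 1
        rows[pre] = (pre, size, level, node.tag)

    visit(root, 0)
    return rows


def descendants(rows, pre):
    """Descendants of the node with rank `pre` are the rows pre+1 .. pre+size."""
    size = rows[pre][1]
    return rows[pre + 1 : pre + 1 + size]


rows = encode("<a><b><c/><d/></b><e/></a>")
for row in rows:
    print(row)
print([r[3] for r in descendants(rows, 1)])
```

Under this assumed layout a descendant lookup is a range selection on the `pre` column, which is the kind of operation a relational engine handles well.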

CiteSeerX

Crossref

CWI's Institutional Repository

University of Twente Research Information

Cracking the database store

Author: Kersten M.L. (Martin)
Manegold S. (Stefan)
Publication venue: CWI
Publication date: 01/01/2004

Query performance strongly depends on finding an execution plan that touches as few superfluous tuples as possible. The access structures d

CWI's Institutional Repository

Adaptive indexing in modern database kernels

Author: Graefe G.
Idreos S. (Stratos)
Manegold S. (Stefan)
Publication venue: EDBT
Publication date: 01/03/2012

Physical design represents one of the hardest problems for database management systems. Without proper tuning, systems cannot achieve good performance. Offline indexing creates indexes a priori assuming good workload knowledge and idle time. More recently, online indexing monitors the workload trends and creates or drops indexes online. Adaptive indexing takes another step towards completely automating the tuning process of a database system, by enabling incremental and partial online indexing. The main idea is that physical design changes continuously, adaptively, partially, incrementally and on demand while processing queries as part of the execution operators. As such it brings a plethora of opportunities for rethinking and improving every single corner of database system design. We will analyze the indexing space between offline, online and adaptive indexing through several state-of-the-art indexing techniques, e.g., what-if analysis and soft indexes. We will discuss in detail adaptive indexing techniques such as database cracking, adaptive merging, sideways cracking and various hybrids that try to balance the online tuning overhead with the convergence speed to optimal performance. In addition, we will discuss how various aspects of modern techniques for database architectures, such as vectorization, bulk processing, column-store execution and storage affect adaptive indexing. Finally, we will discuss several open research topics towards fully autonomous database kernels
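
The idea of indexing incrementally, as a side effect of query processing, can be sketched with a toy version of database cracking, one of the techniques named in the abstract above. This is a simplified illustration under stated assumptions: the class name `CrackedColumn` and its methods are hypothetical, and the partition step builds new lists instead of swapping in place as a real kernel would.

```python
import bisect


class CrackedColumn:
    """Toy cracker column: each range query partitions only the piece it touches."""

    def __init__(self, values):
        self.data = list(values)  # copy of the column that gets reorganized
        self.pivots = []          # sorted pivot values seen so far
        self.positions = []       # positions[i]: first index holding a value >= pivots[i]

    def crack(self, pivot):
        """Reorder so values < pivot precede values >= pivot; return the split index."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this pivot
        # Only the piece between the neighbouring pivots can contain the split.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)
        piece = self.data[lo:hi]
        left = [v for v in piece if v < pivot]
        right = [v for v in piece if v >= pivot]
        self.data[lo:hi] = left + right
        pos = lo + len(left)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, pos)
        return pos

    def range_query(self, low, high):
        """Return all values v with low <= v < high (in no particular order)."""
        start = self.crack(low)
        end = self.crack(high)
        return self.data[start:end]


col = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3])
first = col.range_query(5, 13)
second = col.range_query(2, 8)
print(sorted(first), sorted(second), col.pivots)
```

Each query leaves the column a little more ordered, so later queries on nearby ranges partition smaller pieces. No index is built up front.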

CWI's Institutional Repository

Big Data

Author: Kersten M.L. (Martin)
Manegold S. (Stefan)
Thanos C.
Publication venue: ERCIM
Publication date: 01/04/2012

CWI's Institutional Repository

Self-organizing tuple reconstruction in column-stores

Author: Idreos S. (Stratos)
Kersten M.L. (Martin)
Manegold S. (Stefan)
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/06/2009

Column-stores gained popularity as a promising physical design alternative. Each attribute of a relation is physically stored as a separate column allowing queries to load only the required attributes. The overhead incurred is on-the-fly tuple reconstruction for multi-attribute queries. Each tuple reconstruction is a join of two columns based on tuple IDs, making it a significant cost component. The ultimate physical design is to have multiple presorted copies of each base table such that tuples are already appropriately organized in multiple different orders across the various columns. This requires the ability to predict the workload, idle time to prepare, and infrequent updates. In this paper, we propose a novel design, partial sideways cracking, that minimizes the tuple rec

CWI's Institutional Repository